Overview

Dataset statistics

Number of variables: 11
Number of observations: 856
Missing cells: 1288
Missing cells (%): 13.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 73.7 KiB
Average record size in memory: 88.1 B
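
The derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 856
missing_cells = 1288
total_size_kib = 73.7

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The total size is already rounded in the report, so this is only approximate.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.1f} B")
```

The small difference between the recomputed record size and the reported 88.1 B is consistent with the total size having been rounded before display.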

Variable types

Categorical: 4
Numeric: 7

Warnings

Country has a high cardinality: 107 distinct values (High cardinality)
No. of cases has a high cardinality: 695 distinct values (High cardinality)
No. of deaths has a high cardinality: 519 distinct values (High cardinality)
No. of cases_median is highly correlated with No. of cases_min and 4 other fields (High correlation)
No. of cases_min is highly correlated with No. of cases_median and 4 other fields (High correlation)
No. of cases_max is highly correlated with No. of cases_median and 4 other fields (High correlation)
No. of deaths_median is highly correlated with No. of cases_median and 4 other fields (High correlation)
No. of deaths_min is highly correlated with No. of cases_median and 4 other fields (High correlation)
No. of deaths_max is highly correlated with No. of cases_median and 4 other fields (High correlation)
No. of cases_min has 312 (36.4%) missing values (Missing)
No. of cases_max has 312 (36.4%) missing values (Missing)
No. of deaths_min has 332 (38.8%) missing values (Missing)
No. of deaths_max has 332 (38.8%) missing values (Missing)
Country is uniformly distributed (Uniform)
No. of cases_median has 136 (15.9%) zeros (Zeros)
No. of deaths_median has 265 (31.0%) zeros (Zeros)
No. of deaths_min has 58 (6.8%) zeros (Zeros)

Reproduction

Analysis started: 2021-03-08 13:26:18.473654
Analysis finished: 2021-03-08 13:26:30.141782
Duration: 11.67 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Country
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 107
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
Cabo Verde: 8
Mauritania: 8
Belize: 8
Armenia: 8
Timor-Leste: 8
Other values (102): 816

Length

Max length: 37
Median length: 8
Mean length: 10.11214953
Min length: 4

Characters and Unicode

Total characters: 8656
Distinct characters: 54
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
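
As an illustration of how such character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `category_counts` and the example string are assumptions chosen for illustration; the report does not show how it computes these tables. The two-letter codes are Unicode general categories (`Lu` uppercase letter, `Ll` lowercase letter, `Zs` space separator, `Po` other punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

# Illustrative string only; it is not quoted from the report.
example = "Côte d'Ivoire"
counts = category_counts(example)

for category, count in counts.most_common():
    print(category, count)
```

Applying the same function to every value of a text column and summing the counters would give a table of the same shape as "Most occurring categories".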

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Afghanistan
2nd row: Algeria
3rd row: Angola
4th row: Argentina
5th row: Armenia
Value | Count | Frequency (%)
Cabo Verde | 8 | 0.9%
Mauritania | 8 | 0.9%
Belize | 8 | 0.9%
Armenia | 8 | 0.9%
Timor-Leste | 8 | 0.9%
Thailand | 8 | 0.9%
Cameroon | 8 | 0.9%
Haiti | 8 | 0.9%
Solomon Islands | 8 | 0.9%
Papua New Guinea | 8 | 0.9%
Other values (97) | 776 | 90.7%
Histogram of lengths of the category
Value | Count | Frequency (%)
republic | 80 | 6.4%
of | 56 | 4.5%
guinea | 24 | 1.9%
democratic | 24 | 1.9%
united | 16 | 1.3%
congo | 16 | 1.3%
arab | 16 | 1.3%
people's | 16 | 1.3%
south | 16 | 1.3%
sudan | 16 | 1.3%
Other values (121) | 976 | 77.7%

Most occurring characters

Value | Count | Frequency (%)
a | 1208 | 14.0%
i | 792 | 9.1%
e | 608 | 7.0%
n | 576 | 6.7%
o | 504 | 5.8%
r | 432 | 5.0%
(space) | 400 | 4.6%
u | 344 | 4.0%
l | 312 | 3.6%
t | 296 | 3.4%
Other values (44) | 3184 | 36.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6968 | 80.5%
Uppercase Letter | 1200 | 13.9%
Space Separator | 400 | 4.6%
Open Punctuation | 24 | 0.3%
Close Punctuation | 24 | 0.3%
Other Punctuation | 24 | 0.3%
Dash Punctuation | 16 | 0.2%

Most frequent character per category

ValueCountFrequency (%)
a1208
17.3%
i792
11.4%
e608
 
8.7%
n576
 
8.3%
o504
 
7.2%
r432
 
6.2%
u344
 
4.9%
l312
 
4.5%
t296
 
4.2%
m224
 
3.2%
Other values (17)1672
24.0%
ValueCountFrequency (%)
S120
 
10.0%
C96
 
8.0%
R96
 
8.0%
A88
 
7.3%
B88
 
7.3%
P80
 
6.7%
G80
 
6.7%
M72
 
6.0%
E64
 
5.3%
T64
 
5.3%
Other values (12)352
29.3%
Value | Count | Frequency (%)
(space) | 400 | 100.0%
Value | Count | Frequency (%)
( | 24 | 100.0%
Value | Count | Frequency (%)
) | 24 | 100.0%
Value | Count | Frequency (%)
' | 24 | 100.0%
Value | Count | Frequency (%)
- | 16 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8168
94.4%
Common488
 
5.6%

Most frequent character per script

ValueCountFrequency (%)
a1208
14.8%
i792
 
9.7%
e608
 
7.4%
n576
 
7.1%
o504
 
6.2%
r432
 
5.3%
u344
 
4.2%
l312
 
3.8%
t296
 
3.6%
m224
 
2.7%
Other values (39)2872
35.2%
Value | Count | Frequency (%)
(space) | 400 | 82.0%
( | 24 | 4.9%
) | 24 | 4.9%
' | 24 | 4.9%
- | 16 | 3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII8648
99.9%
None8
 
0.1%

Most frequent character per block

ValueCountFrequency (%)
a1208
 
14.0%
i792
 
9.2%
e608
 
7.0%
n576
 
6.7%
o504
 
5.8%
r432
 
5.0%
400
 
4.6%
u344
 
4.0%
l312
 
3.6%
t296
 
3.4%
Other values (43)3176
36.7%
ValueCountFrequency (%)
ô8
100.0%

Year
Real number (ℝ≥0)

Distinct: 8
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.5
Minimum: 2010
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.8 KiB

Quantile statistics

Minimum: 2010
5th percentile: 2010
Q1: 2011.75
Median: 2013.5
Q3: 2015.25
95th percentile: 2017
Maximum: 2017
Range: 7
Interquartile range (IQR): 3.5

Descriptive statistics

Standard deviation: 2.29262739
Coefficient of variation (CV): 0.001138627956
Kurtosis: -1.238315402
Mean: 2013.5
Median Absolute Deviation (MAD): 2
Skewness: 0
Sum: 1723556
Variance: 5.256140351
Monotonicity: Decreasing
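
Several of the Year statistics can be reproduced from the frequency table alone. The sketch below assumes the column holds the eight years 2010 to 2017 with 107 rows each, that the variance uses the sample (n - 1) denominator, and that quartiles are linearly interpolated between order statistics; these conventions are inferred from the reported numbers, not stated by the report.

```python
import statistics

# Assumption: each of the 8 years appears 107 times (856 rows in total).
years = [year for year in range(2010, 2018) for _ in range(107)]

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation

# method="inclusive" interpolates linearly between neighbouring sorted values.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: the median distance from the median.
mad = statistics.median(abs(y - median) for y in years)

print(mean, variance, std, cv)
print(q1, median, q3, iqr, mad)
```

Because the mean (2013.5) is large relative to the spread, the coefficient of variation is tiny even though the years are evenly spread over the whole range.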
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
2010 | 107 | 12.5%
2011 | 107 | 12.5%
2012 | 107 | 12.5%
2013 | 107 | 12.5%
2014 | 107 | 12.5%
2015 | 107 | 12.5%
2016 | 107 | 12.5%
2017 | 107 | 12.5%
Value | Count | Frequency (%)
2010 | 107 | 12.5%
2011 | 107 | 12.5%
2012 | 107 | 12.5%
2013 | 107 | 12.5%
2014 | 107 | 12.5%
Value | Count | Frequency (%)
2017 | 107 | 12.5%
2016 | 107 | 12.5%
2015 | 107 | 12.5%
2014 | 107 | 12.5%
2013 | 107 | 12.5%

No. of cases
Categorical

HIGH CARDINALITY

Distinct: 695
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
0: 136
3: 6
7: 4
1: 4
19: 3
Other values (690): 703

Length

Max length: 27
Median length: 18
Mean length: 14.21962617
Min length: 1

Characters and Unicode

Total characters: 12172
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 679
Unique (%): 79.3%

Sample

1st row: 630308[495000-801000]
2nd row: 0
3rd row: 4615605[3106000-6661000]
4th row: 0
5th row: 0
Value | Count | Frequency (%)
0 | 136 | 15.9%
3 | 6 | 0.7%
7 | 4 | 0.5%
1 | 4 | 0.5%
19 | 3 | 0.4%
4 | 3 | 0.4%
6 | 3 | 0.4%
2 | 2 | 0.2%
12 | 2 | 0.2%
22 | 2 | 0.2%
Other values (685) | 691 | 80.7%
Histogram of lengths of the category
Value | Count | Frequency (%)
0 | 136 | 15.9%
3 | 6 | 0.7%
7 | 4 | 0.5%
1 | 4 | 0.5%
6 | 3 | 0.4%
19 | 3 | 0.4%
4 | 3 | 0.4%
12 | 2 | 0.2%
15 | 2 | 0.2%
81 | 2 | 0.2%
Other values (685) | 691 | 80.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 3950 | 32.5%
1 | 1063 | 8.7%
2 | 829 | 6.8%
3 | 794 | 6.5%
4 | 746 | 6.1%
5 | 687 | 5.6%
6 | 655 | 5.4%
7 | 624 | 5.1%
8 | 615 | 5.1%
9 | 577 | 4.7%
Other values (3) | 1632 | 13.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number10540
86.6%
Open Punctuation544
 
4.5%
Dash Punctuation544
 
4.5%
Close Punctuation544
 
4.5%

Most frequent character per category

ValueCountFrequency (%)
03950
37.5%
11063
 
10.1%
2829
 
7.9%
3794
 
7.5%
4746
 
7.1%
5687
 
6.5%
6655
 
6.2%
7624
 
5.9%
8615
 
5.8%
9577
 
5.5%
ValueCountFrequency (%)
[544
100.0%
ValueCountFrequency (%)
-544
100.0%
ValueCountFrequency (%)
]544
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common12172
100.0%

Most frequent character per script

ValueCountFrequency (%)
03950
32.5%
11063
 
8.7%
2829
 
6.8%
3794
 
6.5%
4746
 
6.1%
5687
 
5.6%
6655
 
5.4%
7624
 
5.1%
8615
 
5.1%
9577
 
4.7%
Other values (3)1632
13.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII12172
100.0%

Most frequent character per block

ValueCountFrequency (%)
03950
32.5%
11063
 
8.7%
2829
 
6.8%
3794
 
6.5%
4746
 
6.1%
5687
 
5.6%
6655
 
5.4%
7624
 
5.1%
8615
 
5.1%
9577
 
4.7%
Other values (3)1632
13.4%

No. of deaths
Categorical

HIGH CARDINALITY

Distinct: 519
Distinct (%): 60.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
0: 257
1: 15
0[0-1]: 8
1[0-2]: 7
2[0-4]: 7
Other values (514): 562

Length

Max length21
Median length8
Mean length8.373831776
Min length1

Characters and Unicode

Total characters7168
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique486 ?
Unique (%)56.8%

Sample

1st row298[110-510]
2nd row0
3rd row13316[9970-16600]
4th row0
5th row0
Value | Count | Frequency (%)
0 | 257 | 30.0%
1 | 15 | 1.8%
0[0-1] | 8 | 0.9%
1[0-2] | 7 | 0.8%
2[0-4] | 7 | 0.8%
2 | 6 | 0.7%
1[0-3] | 6 | 0.7%
4 | 6 | 0.7%
5[1-9] | 4 | 0.5%
10 | 4 | 0.5%
Other values (509) | 536 | 62.6%
Histogram of lengths of the category
ValueCountFrequency (%)
0257
30.0%
115
 
1.8%
0[0-18
 
0.9%
2[0-47
 
0.8%
1[0-27
 
0.8%
1[0-36
 
0.7%
46
 
0.7%
26
 
0.7%
5[1-94
 
0.5%
104
 
0.5%
Other values (509)536
62.6%

Most occurring characters

ValueCountFrequency (%)
01710
23.9%
1784
10.9%
2555
 
7.7%
[524
 
7.3%
-524
 
7.3%
]524
 
7.3%
4417
 
5.8%
3412
 
5.7%
6386
 
5.4%
5361
 
5.0%
Other values (3)971
13.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number5596
78.1%
Open Punctuation524
 
7.3%
Dash Punctuation524
 
7.3%
Close Punctuation524
 
7.3%

Most frequent character per category

ValueCountFrequency (%)
01710
30.6%
1784
14.0%
2555
 
9.9%
4417
 
7.5%
3412
 
7.4%
6386
 
6.9%
5361
 
6.5%
7345
 
6.2%
8320
 
5.7%
9306
 
5.5%
ValueCountFrequency (%)
[524
100.0%
ValueCountFrequency (%)
-524
100.0%
ValueCountFrequency (%)
]524
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common7168
100.0%

Most frequent character per script

ValueCountFrequency (%)
01710
23.9%
1784
10.9%
2555
 
7.7%
[524
 
7.3%
-524
 
7.3%
]524
 
7.3%
4417
 
5.8%
3412
 
5.7%
6386
 
5.4%
5361
 
5.0%
Other values (3)971
13.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII7168
100.0%

Most frequent character per block

ValueCountFrequency (%)
01710
23.9%
1784
10.9%
2555
 
7.7%
[524
 
7.3%
-524
 
7.3%
]524
 
7.3%
4417
 
5.8%
3412
 
5.7%
6386
 
5.4%
5361
 
5.0%
Other values (3)971
13.5%

No. of cases_median
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct694
Distinct (%)81.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2081990.36
Minimum0
Maximum62020888
Zeros136
Zeros (%)15.9%
Memory size6.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1238.5
median37521
Q31656628.25
95-th percentile8479671.5
Maximum62020888
Range62020888
Interquartile range (IQR)1656389.75

Descriptive statistics

Standard deviation: 6381892.5
Coefficient of variation (CV): 3.065284366
Kurtosis: 55.06562959
Mean: 2081990.36
Median Absolute Deviation (MAD): 37521
Skewness: 6.830041648
Sum: 1782183748
Variance: 4.072855188 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 136 | 15.9%
3 | 6 | 0.7%
1 | 4 | 0.5%
7 | 4 | 0.5%
4 | 3 | 0.4%
6 | 3 | 0.4%
19 | 3 | 0.4%
34 | 2 | 0.2%
242 | 2 | 0.2%
81 | 2 | 0.2%
Other values (684) | 691 | 80.7%
Value | Count | Frequency (%)
0 | 136 | 15.9%
1 | 4 | 0.5%
2 | 2 | 0.2%
3 | 6 | 0.7%
4 | 3 | 0.4%
Value | Count | Frequency (%)
62020888 | 1 | 0.1%
61587135 | 1 | 0.1%
60749349 | 1 | 0.1%
60529456 | 1 | 0.1%
59365039 | 1 | 0.1%

No. of cases_min
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct448
Distinct (%)82.4%
Missing312
Missing (%)36.4%
Infinite0
Infinite (%)0.0%
Mean2157556.342
Minimum30
Maximum43880000
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum30
5-th percentile1200
Q139000
median498000
Q32084500
95-th percentile7118900
Maximum43880000
Range43879970
Interquartile range (IQR)2045500

Descriptive statistics

Standard deviation: 5384821.887
Coefficient of variation (CV): 2.495796649
Kurtosis: 37.59826868
Mean: 2157556.342
Median Absolute Deviation (MAD): 490800
Skewness: 5.713503985
Sum: 1173710650
Variance: 2.899630676 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
33000 | 6 | 0.7%
21000 | 5 | 0.6%
22000 | 4 | 0.5%
60000 | 4 | 0.5%
11000 | 3 | 0.4%
19000 | 3 | 0.4%
606000 | 3 | 0.4%
15000 | 3 | 0.4%
49000 | 3 | 0.4%
145000 | 3 | 0.4%
Other values (438) | 507 | 59.2%
(Missing) | 312 | 36.4%
Value | Count | Frequency (%)
30 | 1 | 0.1%
120 | 2 | 0.2%
230 | 1 | 0.1%
360 | 2 | 0.2%
400 | 1 | 0.1%
Value | Count | Frequency (%)
43880000 | 1 | 0.1%
43800000 | 1 | 0.1%
43510000 | 1 | 0.1%
43310000 | 1 | 0.1%
41180000 | 1 | 0.1%

No. of cases_max
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct481
Distinct (%)88.4%
Missing312
Missing (%)36.4%
Infinite0
Infinite (%)0.0%
Mean4913740.919
Minimum40
Maximum84840000
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum40
5-th percentile1715
Q175000
median1389000
Q35277750
95-th percentile15450000
Maximum84840000
Range84839960
Interquartile range (IQR)5202750

Descriptive statistics

Standard deviation: 11027731.66
Coefficient of variation (CV): 2.244263962
Kurtosis: 30.85611056
Mean: 4913740.919
Median Absolute Deviation (MAD): 1379200
Skewness: 5.127501966
Sum: 2673075060
Variance: 1.216108656 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
120005
 
0.6%
122300004
 
0.5%
10003
 
0.4%
150003
 
0.4%
160003
 
0.4%
340003
 
0.4%
100003
 
0.4%
750003
 
0.4%
1440003
 
0.4%
270003
 
0.4%
Other values (471)511
59.7%
(Missing)312
36.4%
ValueCountFrequency (%)
401
0.1%
1601
0.1%
1701
0.1%
4001
0.1%
4301
0.1%
ValueCountFrequency (%)
848400001
0.1%
838000001
0.1%
832400001
0.1%
827000001
0.1%
815800001
0.1%

No. of deaths_median
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct447
Distinct (%)52.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4713.880841
Minimum0
Maximum146734
Zeros265
Zeros (%)31.0%
Memory size6.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median55.5
Q34096
95-th percentile20372.5
Maximum146734
Range146734
Interquartile range (IQR)4096

Descriptive statistics

Standard deviation: 13183.31289
Coefficient of variation (CV): 2.796700497
Kurtosis: 51.6589746
Mean: 4713.880841
Median Absolute Deviation (MAD): 55.5
Skewness: 6.369920518
Sum: 4035082
Variance: 173799738.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0265
31.0%
132
 
3.7%
221
 
2.5%
39
 
1.1%
59
 
1.1%
79
 
1.1%
108
 
0.9%
48
 
0.9%
65
 
0.6%
344
 
0.5%
Other values (437)486
56.8%
ValueCountFrequency (%)
0265
31.0%
132
 
3.7%
221
 
2.5%
39
 
1.1%
48
 
0.9%
ValueCountFrequency (%)
1467341
0.1%
1365331
0.1%
1252901
0.1%
1164721
0.1%
1078431
0.1%

No. of deaths_min
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct255
Distinct (%)48.7%
Missing332
Missing (%)38.8%
Infinite0
Infinite (%)0.0%
Mean5619.108779
Minimum0
Maximum115000
Zeros58
Zeros (%)6.8%
Memory size6.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q15
median390
Q36592.5
95-th percentile19500
Maximum115000
Range115000
Interquartile range (IQR)6587.5

Descriptive statistics

Standard deviation: 12823.71424
Coefficient of variation (CV): 2.282161593
Kurtosis: 31.70327045
Mean: 5619.108779
Median Absolute Deviation (MAD): 390
Skewness: 5.036093056
Sum: 2944413
Variance: 164447646.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
058
 
6.8%
125
 
2.9%
220
 
2.3%
315
 
1.8%
1013
 
1.5%
6010
 
1.2%
809
 
1.1%
709
 
1.1%
209
 
1.1%
59
 
1.1%
Other values (245)347
40.5%
(Missing)332
38.8%
ValueCountFrequency (%)
058
6.8%
125
2.9%
220
 
2.3%
315
 
1.8%
47
 
0.8%
ValueCountFrequency (%)
1150001
0.1%
1070001
0.1%
981001
0.1%
912001
0.1%
846001
0.1%

No. of deaths_max
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct336
Distinct (%)64.1%
Missing332
Missing (%)38.8%
Infinite0
Infinite (%)0.0%
Mean10149.42939
Minimum1
Maximum179000
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1
5-th percentile3.15
Q1180
median3565
Q312400
95-th percentile41670
Maximum179000
Range178999
Interquartile range (IQR)12220

Descriptive statistics

Standard deviation: 20173.78393
Coefficient of variation (CV): 1.987676662
Kurtosis: 28.83634105
Mean: 10149.42939
Median Absolute Deviation (MAD): 3545
Skewness: 4.759147803
Sum: 5318301
Variance: 406981558.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2012
 
1.4%
111
 
1.3%
49
 
1.1%
39
 
1.1%
109
 
1.1%
57
 
0.8%
1807
 
0.8%
27
 
0.8%
607
 
0.8%
7606
 
0.7%
Other values (326)440
51.4%
(Missing)332
38.8%
ValueCountFrequency (%)
111
1.3%
27
0.8%
39
1.1%
49
1.1%
57
0.8%
ValueCountFrequency (%)
1790001
0.1%
1660001
0.1%
1520001
0.1%
1420001
0.1%
1310001
0.1%

WHO Region
Categorical

Distinct: 6
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
Africa: 344
Americas: 168
Eastern Mediterranean: 112
Western Pacific: 88
Europe: 72

Length

Max length21
Median length8
Mean length10.03738318
Min length6

Characters and Unicode

Total characters8592
Distinct characters23
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEastern Mediterranean
2nd rowAfrica
3rd rowAfrica
4th rowAmericas
5th rowEurope
Value | Count | Frequency (%)
Africa | 344 | 40.2%
Americas | 168 | 19.6%
Eastern Mediterranean | 112 | 13.1%
Western Pacific | 88 | 10.3%
Europe | 72 | 8.4%
South-East Asia | 72 | 8.4%
Histogram of lengths of the category
ValueCountFrequency (%)
africa344
30.5%
americas168
14.9%
mediterranean112
 
9.9%
eastern112
 
9.9%
pacific88
 
7.8%
western88
 
7.8%
asia72
 
6.4%
europe72
 
6.4%
south-east72
 
6.4%

Most occurring characters

ValueCountFrequency (%)
a1080
12.6%
r1008
11.7%
i872
10.1%
e864
10.1%
c688
8.0%
A584
 
6.8%
s512
 
6.0%
t456
 
5.3%
f432
 
5.0%
n424
 
4.9%
Other values (13)1672
19.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7048
82.0%
Uppercase Letter1200
 
14.0%
Space Separator272
 
3.2%
Dash Punctuation72
 
0.8%

Most frequent character per category

ValueCountFrequency (%)
a1080
15.3%
r1008
14.3%
i872
12.4%
e864
12.3%
c688
9.8%
s512
7.3%
t456
6.5%
f432
 
6.1%
n424
 
6.0%
m168
 
2.4%
Other values (5)544
7.7%
ValueCountFrequency (%)
A584
48.7%
E256
21.3%
M112
 
9.3%
W88
 
7.3%
P88
 
7.3%
S72
 
6.0%
Value | Count | Frequency (%)
(space) | 272 | 100.0%
Value | Count | Frequency (%)
- | 72 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8248
96.0%
Common344
 
4.0%

Most frequent character per script

ValueCountFrequency (%)
a1080
13.1%
r1008
12.2%
i872
10.6%
e864
10.5%
c688
8.3%
A584
7.1%
s512
 
6.2%
t456
 
5.5%
f432
 
5.2%
n424
 
5.1%
Other values (11)1328
16.1%
Value | Count | Frequency (%)
(space) | 272 | 79.1%
- | 72 | 20.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII8592
100.0%

Most frequent character per block

ValueCountFrequency (%)
a1080
12.6%
r1008
11.7%
i872
10.1%
e864
10.1%
c688
8.0%
A584
 
6.8%
s512
 
6.0%
t456
 
5.3%
f432
 
5.0%
n424
 
4.9%
Other values (13)1672
19.5%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
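
The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` is illustrative, and the sample (n - 1) denominators are an assumption; they cancel in the ratio, so the population form gives the same r. The example values are the lower and upper case bounds of the four complete rows among the report's first ten sample rows, used only as a small illustration.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / (std_x * std_y)

# No. of cases_min and No. of cases_max for the four complete sample rows.
cases_min = [495000.0, 3106000.0, 30000.0, 2774000.0]
cases_max = [801000.0, 6661000.0, 36000.0, 6552000.0]
r = pearson_r(cases_min, cases_max)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 10 for x in cases_min], cases_max)
print(r, r_rescaled)
```

Four rows are far too few to draw conclusions; the point is only that lower and upper bounds move together, which is in line with the "High correlation" warnings in the overview.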

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
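
A minimal sketch of this two-step procedure follows: rank both variables, then apply the Pearson formula to the ranks. The helper names are illustrative, and giving tied values the average of their positions is an assumption about tie handling that the text above does not specify.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: y = x ** 3.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

In this example ρ reaches its maximum because the relationship is perfectly monotonic, while Pearson's r stays below 1 because the relationship is not linear.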

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
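
The pair-counting definition above can be written directly, as in the sketch below. It implements the formula exactly as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries frequently apply a tie-corrected variant instead, and which variant produced the report's figures is not stated here. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Same sign of both differences means the pair is ordered the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The quadratic loop over all pairs is fine for illustration; for 856 observations it would examine 365,940 pairs.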

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
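
To make "nullity correlation" concrete, the sketch below correlates presence indicators (1 = present, 0 = missing) for pairs of columns. This is an assumed formulation, the Pearson correlation of the indicators, and the function name is illustrative. The values are the ten "First rows" of the sample, with NaN written as `None`.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no variation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

cases_median = [630308, 0, 4615605, 0, 0, 0, 32924, 7, 4111699, 11]
cases_min = [495000.0, None, 3106000.0, None, None, None, 30000.0, None, 2774000.0, None]
deaths_min = [110.0, None, 9970.0, None, None, None, 3.0, None, 5740.0, None]

same_pattern = nullity_correlation(cases_min, deaths_min)
no_variation = nullity_correlation(cases_median, cases_min)
print(same_pattern, no_variation)
```

In these ten rows the two bound columns are missing together. Over the full dataset the counts differ (312 versus 332 missing values), so the correlation there would be high but not perfect.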

Sample

First rows

Row | Country | Year | No. of cases | No. of deaths | No. of cases_median | No. of cases_min | No. of cases_max | No. of deaths_median | No. of deaths_min | No. of deaths_max | WHO Region
0 | Afghanistan | 2017 | 630308[495000-801000] | 298[110-510] | 630308 | 495000.0 | 801000.0 | 298 | 110.0 | 510.0 | Eastern Mediterranean
1 | Algeria | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Africa
2 | Angola | 2017 | 4615605[3106000-6661000] | 13316[9970-16600] | 4615605 | 3106000.0 | 6661000.0 | 13316 | 9970.0 | 16600.0 | Africa
3 | Argentina | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Americas
4 | Armenia | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Europe
5 | Azerbaijan | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Europe
6 | Bangladesh | 2017 | 32924[30000-36000] | 76[3-130] | 32924 | 30000.0 | 36000.0 | 76 | 3.0 | 130.0 | South-East Asia
7 | Belize | 2017 | 7 | 0 | 7 | NaN | NaN | 0 | NaN | NaN | Americas
8 | Benin | 2017 | 4111699[2774000-6552000] | 7328[5740-8920] | 4111699 | 2774000.0 | 6552000.0 | 7328 | 5740.0 | 8920.0 | Africa
9 | Bhutan | 2017 | 11 | 0 | 11 | NaN | NaN | 0 | NaN | NaN | South-East Asia

Last rows

Row | Country | Year | No. of cases | No. of deaths | No. of cases_median | No. of cases_min | No. of cases_max | No. of deaths_median | No. of deaths_min | No. of deaths_max | WHO Region
846 | Uganda | 2010 | 11503116[7618000-17700000] | 21558[17200-26000] | 11503116 | 7618000.0 | 17700000.0 | 21558 | 17200.0 | 26000.0 | Africa
847 | United Arab Emirates | 2010 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Eastern Mediterranean
848 | United Republic of Tanzania | 2010 | 6545932[3955000-9995000] | 20281[17600-23000] | 6545932 | 3955000.0 | 9995000.0 | 20281 | 17600.0 | 23000.0 | Africa
849 | Uzbekistan | 2010 | 3 | 0 | 3 | NaN | NaN | 0 | NaN | NaN | Europe
850 | Vanuatu | 2010 | 15695[12000-20000] | 20[2-40] | 15695 | 12000.0 | 20000.0 | 20 | 2.0 | 40.0 | Western Pacific
851 | Venezuela (Bolivarian Republic of) | 2010 | 57257[47000-74000] | 52[9-90] | 57257 | 47000.0 | 74000.0 | 52 | 9.0 | 90.0 | Americas
852 | Viet Nam | 2010 | 23062[21000-26000] | 45[2-80] | 23062 | 21000.0 | 26000.0 | 45 | 2.0 | 80.0 | Western Pacific
853 | Yemen | 2010 | 1134927[611000-2686000] | 2874[90-8490] | 1134927 | 611000.0 | 2686000.0 | 2874 | 90.0 | 8490.0 | Eastern Mediterranean
854 | Zambia | 2010 | 2169307[1449000-3095000] | 6544[5580-7510] | 2169307 | 1449000.0 | 3095000.0 | 6544 | 5580.0 | 7510.0 | Africa
855 | Zimbabwe | 2010 | 1095083[606000-1717000] | 2803[80-6190] | 1095083 | 606000.0 | 1717000.0 | 2803 | 80.0 | 6190.0 | Africa
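
The sample rows suggest that the `_median`, `_min` and `_max` columns hold the three numbers packed into strings such as `630308[495000-801000]`, with the bounds left missing when the string is a bare number. The report does not show how those columns were produced, so the following is only a sketch of one way to split such strings; the function name `split_estimate` and the regular expression are assumptions.

```python
import re

# A point estimate, optionally followed by "[lower-upper]".
_ESTIMATE = re.compile(r"^(\d+)(?:\[(\d+)-(\d+)\])?$")

def split_estimate(text):
    """Return (estimate, lower, upper); the bounds are None when absent."""
    match = _ESTIMATE.match(text.strip())
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    estimate, lower, upper = match.groups()
    return (
        int(estimate),
        float(lower) if lower is not None else None,
        float(upper) if upper is not None else None,
    )

for raw in ["630308[495000-801000]", "76[3-130]", "0"]:
    print(raw, "->", split_estimate(raw))
```

Treating the original string columns as text is also what makes the report classify No. of cases and No. of deaths as categorical with high cardinality; the split numeric columns are the ones suited to the quantile and correlation statistics.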